Fast Estimation of the Pattern Frequency Spectrum
Authors
Abstract
Both exact and approximate counting of the number of frequent patterns for a given frequency threshold are hard problems. Still, even coarse prior estimates of the number of patterns are useful, as they can be used to set the threshold appropriately and avoid waiting endlessly for an unmanageable number of patterns. Moreover, we argue that the number of patterns for different thresholds is an interesting summary statistic of the data: the pattern frequency spectrum. To enable fast estimation of the number of frequent patterns, we adapt the classical algorithm by Knuth for estimating the size of a search tree. Although the method is known to be theoretically suboptimal, we demonstrate that in practice it not only produces very accurate estimates, but is also very efficient. In addition, we introduce a small variation that can be used to estimate the number of patterns under constraints for which the Apriori property does not hold. The empirical evaluation shows that this approach obtains good estimates for closed itemsets. Finally, we show how the method, together with isotonic regression, can be used to quickly and accurately estimate the pattern frequency spectrum: the curve that shows the number of patterns for every possible value of the frequency threshold. Comparing such a spectrum to one that was constructed using a random data model immediately reveals whether the dataset contains any structure of interest.
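To make the idea of Knuth's estimator concrete, here is a minimal sketch for frequent itemsets. It is not the authors' implementation: the function names, the fixed item order used to define the search tree, and the toy data are all assumptions made for illustration. Each probe walks one random path from the empty itemset downward and accumulates the product of the branching factors seen so far. Averaged over many probes, this gives an unbiased estimate of the number of non-empty frequent itemsets.

```python
import random


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def knuth_probe(transactions, items, minsup, rng):
    """One random root-to-leaf walk in the itemset search tree.

    Returns d1 + d1*d2 + ..., where dk is the number of frequent
    children at depth k along the walk (the root itself is not counted).
    """
    estimate = 0.0
    weight = 1.0           # product of branching factors so far
    current = frozenset()  # the empty itemset is the root
    last = -1              # index of the last added item (canonical order)
    while True:
        # Children extend the itemset with a later item and stay frequent.
        children = [
            i for i in range(last + 1, len(items))
            if support(current | {items[i]}, transactions) >= minsup
        ]
        if not children:
            return estimate
        weight *= len(children)
        estimate += weight
        last = rng.choice(children)
        current = current | {items[last]}


def estimate_frequent_itemsets(transactions, minsup, probes=1000, seed=0):
    """Average of `probes` independent Knuth probes (hypothetical helper)."""
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    total = sum(knuth_probe(transactions, items, minsup, rng)
                for _ in range(probes))
    return total / probes


def count_exact(transactions, minsup):
    """Exhaustive depth-first count, only feasible for tiny data."""
    items = sorted(set().union(*transactions))

    def dfs(current, last):
        n = 0
        for i in range(last + 1, len(items)):
            child = current | {items[i]}
            if support(child, transactions) >= minsup:
                n += 1 + dfs(child, i)
        return n

    return dfs(frozenset(), -1)


# Toy dataset (assumed for illustration only).
TOY = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
```

Calling `estimate_frequent_itemsets(TOY, minsup=2)` and `count_exact(TOY, minsup=2)` side by side shows how close the sampled estimate comes to the true count on small data. The sketch relies on the Apriori property, since a walk stops as soon as no frequent extension exists. It does not cover the variation for constraints where that property fails, nor the isotonic regression step.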
Similar resources
9 Power Spectrum and Correlation
The power spectrum reveals the existence, or the absence, of repetitive patterns and correlation structures in a signal process. These structural patterns are important in a wide range of applications such as data forecasting, signal coding, signal detection, radar, pattern recognition, and decision-making systems. The most common method of spectral estimation is based on the fast Fourier transf...
The Changes of Leg Musclus Activities Following Increase of Gait Velocity
Purpose: Motor control evaluation and analysis of its specifications for the diagnosis of neuromuscular diseases is a new approach in clinical electroneurophysiology that is based on changes in electromyography responses and classic reflexes. In this study, quantitative power spectrum frequency was used to assess changes in motor control strategies. Materials and Methods: Twenty five health...
Estimation of kinematic source parameters and frequency independent shear wave quality factor around Bushehr
In this paper, the shear wave quality factor and source parameters in the near field are estimated by analyzing the acceleration data in Zagros region. Accelerograms recorded by Building and Houses Research Center strong ground motion network have been used. The data have been considered with the magnitude of 4.7 to 6.3 collected from 1999 to 2014. In this approach, the theoretical S-wave displ...
Large-scale Inversion of Magnetic Data Using Golub-Kahan Bidiagonalization with Truncated Generalized Cross Validation for Regularization Parameter Estimation
In this paper a fast method for large-scale sparse inversion of magnetic data is considered. The L1-norm stabilizer is used to generate models with sharp and distinct interfaces. To deal with the non-linearity introduced by the L1-norm, a model-space iteratively reweighted least squares algorithm is used. The original model matrix is factorized using the Golub-Kahan bidiagonalization that proje...
The Effects of Changing Footstrike Pattern on the Amplitude and Frequency Spectrum of Ground Reaction Forces During Running in Individuals With Pronated Feet
Background: The current study aimed to evaluate the effects of barefoot and shod running with two different styles on ground reaction force-frequency content in recreational runners with low arched feet. Methods: The statistical sample of this research was 13 males with PF (mean±SD age: 26.2±2.8 y; height: 176.1±8.4 cm; weight: 78.3±14.3 kg). A force plate (Bertec, USA) with a sample rate of 1...
Publication date: 2014